The ggplot2 package provides a system for
- data graphics with a …
- professional look, that can …
- easily be layered and extended.
Decisions about axis ranges and "keys" are
- made sensibly out of the box and
- can be overridden manually.
TCRUG — November 20, 2014
For graphing data, your choice is between lattice and ggplot
lattice provides many of the same features as ggplot.
* lattice has an easier basic syntax
* ggplot has a more consistent notation
"Base graphics are good for drawing pictures; ggplot2 graphics are good for understanding the data." — Wickham, 2012
Describe the conceptual structure of ggplot commands and cover the basic vocabulary.
I want you to be able to read ggplot commands.
If you can read, you can decide what to copy. Most programming starts with copying something that already works.
Once you know these, you can read, which means you can extend the many examples available on the Internet.
geom: Something with graphical attributes, "aesthetics", to be shown in the frame.
The frame is the meaning of space in the graphic.
A graphic requires a frame and one or more layers.
ggplot() creates a new frame for you to add layers to.
"I want to start a new graphic."
ex1 <- ggplot()
"Start it, holding the data for further reference, and assigning these variables to be represented by space."
ex2 <- ggplot( data=NHANES, aes( x=age, y=height ) )
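To see what a frame holds before any layer is added, here is a minimal sketch. It assumes a small made-up data frame, `toy`, standing in for NHANES: the column names `age` and `height` come from the slides, while the values are invented for illustration.

```r
library(ggplot2)

# Hypothetical stand-in for NHANES: invented values, same column names.
toy <- data.frame(
  age    = c(5, 12, 20, 35, 50, 65),
  height = c(1.10, 1.50, 1.75, 1.72, 1.70, 1.68)
)

# The frame: reference data plus the variables that give meaning to x and y.
frame <- ggplot(data = toy, aes(x = age, y = height))

# No layers have been added yet, so printing `frame` would show only axes.
n_layers <- length(frame$layers)
```

Printing `frame` at the console should draw axes scaled to the data with nothing inside them; glyphs appear only once a layer is added.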
If you've set up the frame with reference data and variables for space, just say what kind of glyph you want …
ex2 + geom_point( )
Within a layer, you can specify values for graphical characteristics. When these are the same for all cases, this is called "setting."
ex2 + geom_point( alpha=.3, color="blue" )
When the graphical characteristics of each glyph are to be properties based on individual cases in the data, you map a variable onto the property. This is signaled by stating the equivalence within aes().
#                 setting   mapping
ex2 + geom_point( alpha=.3, aes( color=sex ) )
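The difference between setting and mapping can be checked by looking at the values each glyph is actually drawn with. The sketch below assumes an invented data frame `toy` in place of NHANES (hypothetical values, column names as on the slides) and uses `layer_data()` from ggplot2 to pull out the computed colours.

```r
library(ggplot2)

# Hypothetical stand-in for NHANES: invented values.
toy <- data.frame(
  age    = c(10, 20, 30, 40, 50, 60),
  height = c(1.40, 1.75, 1.62, 1.80, 1.60, 1.74),
  sex    = c("female", "male", "female", "male", "female", "male")
)
frame <- ggplot(toy, aes(x = age, y = height))

# Setting: one constant value for every glyph, given outside aes().
p_set <- frame + geom_point(alpha = 0.3, color = "blue")

# Mapping: the colour of each glyph is looked up from the sex variable.
p_map <- frame + geom_point(alpha = 0.3, aes(color = sex))

# layer_data() returns the values each glyph is drawn with.
set_colours <- unique(layer_data(p_set)$colour)  # a single colour
map_colours <- unique(layer_data(p_map)$colour)  # one colour per level of sex
```

With setting, every case gets the same colour and no legend is needed. With mapping, ggplot chooses a colour per level and adds a legend.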
##  [1] "geom_abline"     "geom_aesthetics" "geom_area"
##  [4] "geom_bar"        "geom_bin2d"      "geom_blank"
##  [7] "geom_boxplot"    "geom_contour"    "geom_crossbar"
## [10] "geom_density"    "geom_density2d"  "geom_dotplot"
## [13] "geom_errorbar"   "geom_errorbarh"  "geom_freqpoly"
## [16] "geom_hex"        "geom_histogram"  "geom_hline"
## [19] "geom_jitter"     "geom_line"       "geom_linerange"
## [22] "geom_map"        "geom_path"       "geom_point"
## [25] "geom_pointrange" "geom_polygon"    "geom_quantile"
## [28] "geom_raster"     "geom_rect"       "geom_ribbon"
## [31] "geom_rug"        "geom_segment"    "geom_smooth"
## [34] "geom_step"       "geom_text"       "geom_tile"
## [37] "geom_violin"     "geom_vline"
Smokers <- NHANES %>% filter( smoker=="yes" )
ex2 +                                         # start a new graph
  geom_point( alpha=.2, aes( color=sex ) ) +  # First layer
  # Override data why in aes()?
  geom_rug( data=Smokers, sides="r", alpha=.05, aes( color=sex ) ) +
  geom_rug( data=Smokers, sides="t", alpha=.05, aes( color=sex ), position="jitter" )
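A layer that is given its own `data=` draws from that data instead of the frame's. The sketch below illustrates this with an invented data frame `toy` (a hypothetical stand-in for NHANES) and base R `subset()` in place of the dplyr pipeline, so it needs only ggplot2.

```r
library(ggplot2)

# Hypothetical stand-in for NHANES: invented values.
toy <- data.frame(
  age    = c(25, 35, 45, 55, 65, 75),
  height = c(1.70, 1.62, 1.80, 1.58, 1.66, 1.74),
  sex    = c("male", "female", "male", "female", "female", "male"),
  smoker = c("yes", "no", "yes", "no", "no", "yes")
)
toy_smokers <- subset(toy, smoker == "yes")

p <- ggplot(toy, aes(x = age, y = height)) +
  geom_point(aes(color = sex)) +                              # uses the frame's data
  geom_rug(data = toy_smokers, sides = "r", aes(color = sex)) # uses its own data

# Each layer keeps one row per case it draws.
n_points <- nrow(layer_data(p, 1))
n_rugs   <- nrow(layer_data(p, 2))
```

The point layer draws every case, while the rug layer draws only the smokers. The x and y variables are still inherited from the frame, which is why the rug layer does not repeat them.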
Geoms show individual cases in the data.
Stats show aggregate properties of the cases.
ex2 + stat_smooth( aes( color=sex ))
ex2 + stat_smooth( aes( color=sex )) + geom_point(alpha=.1, aes( color=sex ) )
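One way to see that a stat shows aggregate properties rather than individual cases is to compare the data behind the two layers. This sketch assumes an invented data frame `toy` (hypothetical values, not NHANES) and, as a further assumption, uses `method = "lm"` rather than the default smoother so that the tiny data set fits without warnings.

```r
library(ggplot2)

# Hypothetical stand-in for NHANES: 9 invented cases per sex.
toy <- data.frame(
  age = rep(seq(20, 60, by = 5), times = 2),
  sex = rep(c("female", "male"), each = 9)
)
toy$height <- ifelse(toy$sex == "male", 1.78, 1.64) -
  0.0005 * (toy$age - 20) + rep(c(0.01, -0.01, 0), 6)

p <- ggplot(toy, aes(x = age, y = height)) +
  stat_smooth(aes(color = sex), method = "lm", formula = y ~ x) +
  geom_point(alpha = 0.5, aes(color = sex))

smooth_data <- layer_data(p, 1)  # fitted values on a grid, with ymin/ymax bands
point_data  <- layer_data(p, 2)  # one row per case in toy
```

The point layer has exactly one row per case. The smooth layer instead holds values computed from the cases: a fitted curve and its confidence band for each sex.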
basicPlot <- ex2 + stat_smooth( aes( color=sex )) + geom_point(alpha=.1, aes( color=sex ))
basicPlot + theme_economist()
basicPlot + theme_excel()
basicPlot + theme_wsj()
basicPlot + theme_tufte()
Some weather graphics here
Mapping a variable without aes(), e.g. geom_point( color=sex )
Setting a constant inside aes(), e.g. geom_point( aes(color="blue") )
Others:
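Both calls go wrong in different ways, which a short sketch can make concrete. It assumes an invented data frame `toy` (hypothetical values, not NHANES) and that no object named `sex` exists in the workspace.

```r
library(ggplot2)

# Hypothetical stand-in for NHANES: invented values.
toy <- data.frame(
  age    = c(20, 40, 60, 30),
  height = c(1.70, 1.65, 1.60, 1.75),
  sex    = c("female", "male", "female", "male")
)
frame <- ggplot(toy, aes(x = age, y = height))

# A variable named outside aes() is looked up in the workspace, not the data,
# so this fails when no object called `sex` exists there.
err <- tryCatch(frame + geom_point(color = sex),
                error = function(e) conditionMessage(e))

# A constant inside aes() is treated as data with the single value "blue",
# which the colour scale then maps to its own default colour.
p_wrong <- frame + geom_point(aes(color = "blue"))
p_right <- frame + geom_point(color = "blue")

wrong_colour <- unique(layer_data(p_wrong)$colour)
right_colour <- unique(layer_data(p_right)$colour)
```

The first mistake stops with an error. The second runs without complaint but draws the points in a default scale colour and adds a legend whose only entry is labelled "blue".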
Specialized vocabulary: